Efficiently Computable Similarity Measures for Query by Tapping Systems
A Query by Tapping system is a database which contains metadata descriptions of songs. The database can be searched by tapping the rhythm of the requested song's melody line on a MIDI keyboard or an e-drum. To process a query, the system computes the similarity between the query and the database content by applying a similarity measure. Because large databases require a high number of comparisons, efficiently computable similarity measures are needed. This paper presents two efficiently computable similarity measures which evaluate rhythmic properties of monophonic melodies represented in an MPEG-7 compliant manner. Their usage and effectiveness are presented and evaluated with the real-time capable Query by Tapping system BeatBank.
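To make the idea of a cheaply computable rhythmic similarity concrete, the following is a minimal sketch and not one of the two measures presented in the paper. It assumes that a tapped query and a stored melody are both given as lists of onset times in seconds, and it compares their inter-onset intervals after normalizing them so that tempo differences cancel out. The function names `normalized_iois` and `rhythm_similarity` are hypothetical.

```python
def normalized_iois(onsets):
    """Return inter-onset intervals scaled so that they sum to 1.

    Normalizing by the total duration makes the result independent of
    tempo (an assumption of this sketch, not taken from the paper).
    """
    iois = [b - a for a, b in zip(onsets, onsets[1:])]
    if not iois or any(x <= 0 for x in iois):
        raise ValueError("need at least two strictly increasing onset times")
    total = sum(iois)
    return [x / total for x in iois]


def rhythm_similarity(query_onsets, reference_onsets):
    """Similarity in [0, 1] between two tapped rhythms (hypothetical measure).

    Rhythms with a different number of taps are scored 0 for simplicity;
    a real system would need to tolerate missing or extra taps.
    """
    q = normalized_iois(query_onsets)
    r = normalized_iois(reference_onsets)
    if len(q) != len(r):
        return 0.0
    # The L1 distance between two vectors that each sum to 1 lies in [0, 2].
    distance = sum(abs(a - b) for a, b in zip(q, r))
    return 1.0 - distance / 2.0
```

The cost is linear in the number of taps, which is the kind of property that matters when every query has to be compared against every database entry.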
FEAPI: a low level feature extraction plugin API
This paper presents FEAPI, an easy-to-use, platform-independent plugin application programming interface (API) for the extraction of low-level features from audio in PCM format in the context of music information retrieval software. The paper outlines the need for, and the advantages of, an open and well-defined plugin interface and gives an overview of the API itself and its usage.
Granular Resynthesis for Sound Unmixing
In modern music genres like Pop, Rap, Hip-Hop or Techno, many songs are built from a pool of short musical pieces, so-called loops, which serve as building blocks. These loops are usually one, two or four bars long and form the accompaniment for the lead melody or singing voice. Very often the accompanying loops can be heard solo at least once in a song. This can be used as a priori knowledge for removing these loops from the mixture. In this paper an algorithm based on granular resynthesis and spectral subtraction is presented which makes use of this a priori knowledge. The algorithm uses two different synthesis strategies and is capable of removing known loops from mixtures even if the loop signal contained in the mixture signal is slightly different from the solo loop signal.
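The spectral subtraction step mentioned above can be illustrated with a minimal sketch. It covers only plain magnitude subtraction on a single frame and does not reproduce the paper's granular resynthesis or its two synthesis strategies. It assumes that the mixture frame and the solo loop frame are time-aligned lists of samples of equal length; the helper names `dft`, `idft` and `spectral_subtract` are hypothetical, and a naive DFT is used only to keep the example self-contained.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (O(n^2), for illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_subtract(mixture, loop):
    """Subtract the loop's magnitude spectrum from the mixture's.

    Magnitudes are floored at zero and the mixture phase is kept, which
    is the classic spectral subtraction recipe (assumed here).
    """
    if len(mixture) != len(loop):
        raise ValueError("frames must have the same length")
    result = []
    for m, l in zip(dft(mixture), dft(loop)):
        magnitude = max(abs(m) - abs(l), 0.0)
        result.append(cmath.rect(magnitude, cmath.phase(m)))
    return idft(result)
```

When the loop and the remaining signal occupy different frequency bins, the remainder is recovered almost exactly. Where they overlap, magnitude subtraction is only an approximation.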
The REACTION System: Automatic Sound Segmentation and Word Spotting for Verbal Reaction Time Tests
Reaction tests are typical tests from the field of psychological research and communication science in which a test person is presented with a stimulus such as a photo, a sound, or written words. The individual has to evaluate the stimulus as fast as possible in a predefined manner and has to react by presenting the result of the evaluation. This could be by pushing a button in simple reaction tests or by saying an answer in verbal reaction tests. The reaction time between the onset of the stimulus and the onset of the response can be used as a measure of the difficulty of performing the given evaluation. Compared to simple reaction tests, verbal reaction tests are very powerful, since the individual can simply say the answer, which is the most natural way of answering. The drawback of verbal reaction tests is that today the reaction times still have to be determined manually. This means that a person has to listen through all audio recordings taken during test sessions and mark stimulus times and word beginnings one by one, which is very time-consuming and labor-intensive. To replace the manual evaluation of reaction tests, this article presents the REACTION (Reaction Time Determination) system, which can automatically determine the reaction times of a test session by analyzing the audio recording of the session. The system automatically detects the onsets of stimuli as well as the onsets of answers. Furthermore, the recording is segmented into parts, each containing one stimulus and the following reaction, which facilitates the transcription of the spoken words for a semantic evaluation.
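The reaction time definition above (onset of the response minus onset of the stimulus) can be written down as a short sketch. It assumes that onset detection has already produced two lists of times in seconds, and it does not attempt to reproduce the REACTION system's own detection or segmentation. The function name `reaction_times` and the rule that a response belongs to the most recent stimulus are assumptions made for illustration.

```python
def reaction_times(stimulus_onsets, response_onsets):
    """Pair each stimulus with the first response that follows it.

    A response counts for a stimulus only if it occurs after that
    stimulus and before the next one (an assumption of this sketch).
    Returns one entry per stimulus: the reaction time in seconds, or
    None if no response fell into that segment.
    """
    stimuli = sorted(stimulus_onsets)
    responses = sorted(response_onsets)
    results = []
    for i, start in enumerate(stimuli):
        # The segment ends where the next stimulus begins.
        end = stimuli[i + 1] if i + 1 < len(stimuli) else float("inf")
        first = next((r for r in responses if start < r < end), None)
        results.append(None if first is None else first - start)
    return results
```

For example, with stimuli at 0.0 s, 5.0 s and 10.0 s and responses at 1.2 s and 11.5 s, the second stimulus has no response in its segment and is reported as `None`.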